318        Bioinformatics

The assemblies’ evaluation report contains important statistics that reflect the quality of

the assemblies in the file. Figure 8.4 shows a partial evaluation report for the three samples.

The colored heatmap indicates the quality from the worst (red) to the best (blue). On the

top, there are links that lead to each sample’s graph, as shown in Figure 8.5. The graphs

show the key identified bacterial taxa and their abundance. Refer to the program users’

manual, which is available at “http://cab.cc.spbu.ru/quast/manual.html”, to read more

about the program’s use and the different report sections, and refer to Chapter 3 to read more

about the de novo assembly evaluation metrics.
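
As a reminder of what one such statistic measures, the following minimal Python sketch computes N50, a contiguity metric that QUAST includes in its reports. It is an illustration only and not how QUAST itself is implemented; the function names contig_lengths and n50 are hypothetical and chosen for this example.

```python
def contig_lengths(fasta_lines):
    """Return the length of each sequence in an iterable of FASTA lines."""
    lengths = []
    current = None  # length of the record being read; None before the first header
    for line in fasta_lines:
        line = line.strip()
        if line.startswith(">"):
            # A header starts a new record; store the previous one, if any.
            if current is not None:
                lengths.append(current)
            current = 0
        elif line and current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Return the N50 of the given contig lengths (0 for an empty input).

    N50 is the length of the contig at which the running total, taken
    from the longest contig downward, first reaches half of the total
    assembly length.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return 0
```

For example, contigs of 500, 400, 300, 200, and 100 bases total 1,500 bases; the two longest contigs together reach 900 bases, which passes the halfway mark of 750, so the N50 is 400. To apply the sketch to an assembly, open a “scaffolds.fasta” file and pass the file handle to contig_lengths, then pass the result to n50.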

8.2.6  Mapping Reads to the Assemblies

We have already created the metagenomic FASTQ files, which were produced from the reads

that did not map to the host reference genome. Then, we used a de novo assembler to produce an

assembly (scaffolds.fasta) for each sample that may contain genomic sequences of several

microbes. In this step, we will use an aligner to map the reads in the FASTQ files to the

respective assembly. For this purpose, we can use the Bowtie2 aligner. First, we need to build

an index for the “scaffolds.fasta” file of each sample, and then we will use it to align the reads

in the FASTQ files. Now, let us create a directory named “assemblies” in the main project

directory and copy the scaffolds FASTA files from the three sample directories into this new

directory with new file names as follows:

FIGURE 8.4  An assemblies’ evaluation report generated with metaquast.py.